Budding
planted Jan 8, 2026 · tended Jan 8, 2026
#ai-agents #fundamentals #architecture #autonomous-systems

AI Agents Fundamentals

🌿 Budding note — foundational concepts for autonomous AI systems.

What is an AI Agent?

An AI agent is an autonomous system that:

  1. Perceives its environment through inputs (text, APIs, sensors)
  2. Reasons about what actions to take
  3. Acts on the environment to achieve goals
  4. Learns from feedback to improve over time

Key distinction from chatbots:

Chatbot:    User → LLM → Response
Agent:      User → LLM → [Tools] → Actions → Results → LLM → Response
                    ↑__________________|
                    Autonomous loop
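
The perceive/reason/act/learn cycle above can be sketched as a minimal skeleton. The callables here (`perceive`, `reason`, `act`, `learn`, `done`) are placeholders for an agent's real components, not any particular framework's API:

```python
from typing import Any, Callable

def agent_loop(
    perceive: Callable[[], Any],
    reason: Callable[[Any], Any],
    act: Callable[[Any], Any],
    learn: Callable[[Any, Any], None],
    done: Callable[[Any], bool],
    max_steps: int = 10,
) -> Any:
    """Perceive -> reason -> act -> learn, repeated until the goal is met."""
    result = None
    for _ in range(max_steps):
        observation = perceive()      # 1. perceive the environment
        action = reason(observation)  # 2. decide what to do
        result = act(action)          # 3. act on the environment
        learn(action, result)         # 4. learn from feedback
        if done(result):
            break
    return result
```

The `max_steps` cap matters: without it, the autonomous loop that distinguishes agents from chatbots can also run away (see the reliability challenges later in this note).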

Core Components

1. The Brain (LLM)

The reasoning engine that makes decisions:

# Claude as the agent brain
from anthropic import Anthropic

client = Anthropic()

def agent_think(task: str, context: dict) -> str:
    """Agent reasoning with Claude"""
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=4096,
        system="You are an autonomous agent. Analyze the task and decide what action to take.",
        messages=[{
            "role": "user",
            "content": f"Task: {task}\nContext: {context}\nWhat should I do next?"
        }]
    )
    return response.content[0].text

Popular models for agents:

  • Claude Sonnet 4.5: Best for complex reasoning, tool use
  • GPT-4: Strong general capabilities
  • Mixtral: Open-source alternative
  • Gemini Pro: Google's multimodal option

Related: Claude Agent Patterns for Claude-specific best practices

2. Memory Systems

Agents need memory to maintain context and learn:

Short-term memory (conversation buffer):

class AgentMemory:
    def __init__(self, max_messages: int = 10):
        self.messages = []
        self.max_messages = max_messages

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        if len(self.messages) > self.max_messages:
            self.messages.pop(0)  # FIFO

    def get_context(self) -> list:
        return self.messages

Long-term memory (vector database):

import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

class LongTermMemory:
    def __init__(self):
        self.client = QdrantClient(":memory:")
        self.encoder = SentenceTransformer('all-MiniLM-L6-v2')
        # all-MiniLM-L6-v2 produces 384-dimensional embeddings
        self.client.create_collection(
            collection_name="memories",
            vectors_config=VectorParams(size=384, distance=Distance.COSINE),
        )

    def store(self, memory: str, metadata: dict):
        """Store experience in vector DB"""
        vector = self.encoder.encode(memory).tolist()
        self.client.upsert(
            collection_name="memories",
            points=[PointStruct(
                # Qdrant ids must be unsigned ints or UUIDs; deriving a
                # stable UUID from the text makes duplicates overwrite
                id=str(uuid.uuid5(uuid.NAMESPACE_OID, memory)),
                vector=vector,
                payload={"text": memory, **metadata},
            )]
        )

    def recall(self, query: str, limit: int = 5):
        """Retrieve relevant memories"""
        vector = self.encoder.encode(query).tolist()
        results = self.client.search(
            collection_name="memories",
            query_vector=vector,
            limit=limit
        )
        return [hit.payload for hit in results]

Deep dive: Agent Memory Systems

3. Tool Use

Agents extend their capabilities through tools:

Tool definition:

from typing import Callable, Dict, Any

class Tool:
    def __init__(
        self,
        name: str,
        description: str,
        function: Callable,
        parameters: Dict[str, Any]
    ):
        self.name = name
        self.description = description
        self.function = function
        self.parameters = parameters

    def execute(self, **kwargs) -> Any:
        return self.function(**kwargs)

# Example tools (search_api stands in for a real search client)
web_search = Tool(
    name="web_search",
    description="Search the web for current information",
    function=lambda query: search_api(query),
    parameters={"query": {"type": "string", "required": True}}
)

calculator = Tool(
    name="calculator",
    description="Perform mathematical calculations",
    function=lambda expression: eval(expression),  # unsafe on untrusted input
    parameters={"expression": {"type": "string", "required": True}}
)

Related: Tool Use & Function Calling

4. Planning & Reasoning

Agents use various strategies to plan actions:

ReAct Pattern (Reasoning + Acting):

Thought: I need to find the current weather in Tokyo
Action: web_search("Tokyo weather today")
Observation: Temperature is 18°C, partly cloudy
Thought: Now I can answer the user's question
Action: respond("The weather in Tokyo is 18°C and partly cloudy")
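
Turning such a trace back into something executable requires parsing the LLM's output. A minimal regex-based parser for the format shown above might look like this (it assumes actions are written as `tool_name("single string argument")`, which real implementations generalize):

```python
import re
from typing import Optional

def parse_react_step(text: str) -> Optional[dict]:
    """Extract the last Thought/Action pair from a ReAct-style trace.

    Assumes each action is written as tool_name("string argument").
    Returns None if no action is found.
    """
    thoughts = re.findall(r"Thought:\s*(.+)", text)
    actions = re.findall(r'Action:\s*(\w+)\("(.*)"\)', text)
    if not actions:
        return None
    tool, arg = actions[-1]
    return {
        "thought": thoughts[-1] if thoughts else "",
        "tool": tool,
        "arg": arg,
    }
```

In practice, structured tool-use APIs (where the model emits JSON rather than free text) have largely replaced this kind of parsing, but it is still how the original ReAct loop worked.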

Chain-of-Thought:

def chain_of_thought_prompt(question: str) -> str:
    return f"""Let's approach this step-by-step:
1. First, identify what information we need
2. Break down the problem into sub-tasks
3. Execute each sub-task
4. Synthesize the results

Question: {question}

Let's begin:"""

Tree of Thoughts (explore multiple reasoning paths):

# generate_next_thought, evaluate_thought, and traverse_tree are
# placeholders for LLM-backed helpers
class ThoughtTree:
    def __init__(self, root_thought: str):
        self.root = {"thought": root_thought, "children": [], "score": 0}

    def expand(self, node: dict, num_branches: int = 3):
        """Generate alternative reasoning paths"""
        for i in range(num_branches):
            child_thought = generate_next_thought(node["thought"])
            child = {
                "thought": child_thought,
                "children": [],
                "score": evaluate_thought(child_thought)
            }
            node["children"].append(child)

    def best_path(self) -> list:
        """Find highest-scoring reasoning path"""
        return traverse_tree(self.root, key=lambda n: n["score"])

Agent Architectures

1. Simple ReAct Agent

Single LLM call with tools:

class ReActAgent:
    def __init__(self, llm, tools: list[Tool]):
        self.llm = llm
        self.tools = {t.name: t for t in tools}

    def run(self, task: str, max_iterations: int = 10):
        """Execute ReAct loop"""
        context = []

        for i in range(max_iterations):
            # Reasoning step
            prompt = self._build_prompt(task, context)
            response = self.llm.generate(prompt)

            # Parse action
            action = self._parse_action(response)
            if action["type"] == "final_answer":
                return action["content"]

            # Execute tool
            tool = self.tools[action["tool"]]
            result = tool.execute(**action["args"])
            context.append({
                "thought": response,
                "action": action,
                "observation": result
            })

        return "Max iterations reached"

2. Multi-Agent Systems

Specialized agents collaborating:

class MultiAgentSystem:
    def __init__(self):
        self.agents = {
            "researcher": ResearchAgent(),
            "writer": WriterAgent(),
            "critic": CriticAgent(),
        }

    async def solve(self, task: str):
        """Collaborative problem solving"""
        # Research phase
        research = await self.agents["researcher"].investigate(task)

        # Writing phase
        draft = await self.agents["writer"].write(research)

        # Review phase
        feedback = await self.agents["critic"].review(draft)

        # Refinement
        final = await self.agents["writer"].revise(draft, feedback)
        return final

Deep dive: Multi-Agent Systems

3. Hierarchical Agents

Manager delegates to specialists:

Manager Agent
    ├── Planning Agent (strategy)
    ├── Execution Agent (actions)
    └── Monitoring Agent (validation)

class HierarchicalAgent:
    def __init__(self):
        self.manager = ManagerAgent()
        self.specialists = {
            "planner": PlanningAgent(),
            "executor": ExecutionAgent(),
            "monitor": MonitorAgent(),
        }

    async def run(self, goal: str):
        # Manager creates plan
        plan = await self.manager.plan(goal)

        # Delegate to specialists
        results = []
        for step in plan.steps:
            specialist = self.specialists[step.type]
            result = await specialist.execute(step)
            results.append(result)

        # Manager synthesizes
        return await self.manager.synthesize(results)

Agent Types

1. Task Completion Agents

Goal: Complete specific tasks (research, data analysis, code generation)

class TaskAgent:
    async def complete_task(self, task_description: str):
        # Understand task
        requirements = await self.analyze_task(task_description)

        # Break into steps
        plan = await self.create_plan(requirements)

        # Execute steps
        for step in plan:
            result = await self.execute_step(step)
            if not self.validate(result):
                await self.revise_plan(step)

        return self.synthesize_results()

Examples:

  • Research assistant (gather information)
  • Code generator (write functions)
  • Data analyst (SQL queries, visualizations)

2. Conversational Agents

Goal: Natural dialogue with memory and personality

class ConversationalAgent:
    def __init__(self, personality: str):
        self.personality = personality
        self.memory = AgentMemory()

    async def chat(self, user_message: str):
        # Add to memory
        self.memory.add("user", user_message)

        # Generate response with personality
        system_prompt = f"You are {self.personality}"
        response = await self.llm.generate(
            system=system_prompt,
            messages=self.memory.get_context()
        )

        self.memory.add("assistant", response)
        return response

Examples:

  • Customer support bots
  • Personal assistants
  • Tutoring systems

3. Autonomous Agents

Goal: Continuous operation toward long-term goals

import asyncio

class AutonomousAgent:
    def __init__(self, goal: str, tick_rate: float = 1.0):
        self.goal = goal
        self.state = {}
        self.tick_rate = tick_rate  # seconds between loop iterations

    async def run_loop(self):
        """Continuous operation"""
        while not self.goal_achieved():
            # Perceive environment
            observations = await self.observe()

            # Update internal state
            self.state = await self.update_state(observations)

            # Decide next action
            action = await self.decide(self.state)

            # Execute action
            result = await self.execute(action)

            # Learn from result
            await self.learn(action, result)

            await asyncio.sleep(self.tick_rate)

Examples:

  • AutoGPT (recursive task completion)
  • Cryptocurrency trading bots
  • DevOps automation agents

Key Concepts

Agentic Behavior

What makes something truly "agentic":

  1. Autonomy: Self-directed action without human intervention
  2. Reactivity: Responds to environment changes
  3. Proactivity: Takes initiative toward goals
  4. Social ability: Interacts with other agents/humans

Grounding

Connecting agent actions to real-world effects:

# WorldState, execute, rollback, and the exception types are placeholders
def grounded_action(action: str, world_model: WorldState):
    """Ensure action has real effect"""
    # Verify pre-conditions
    if not world_model.check_preconditions(action):
        raise InvalidAction("Preconditions not met")

    # Execute with confirmation
    result = execute(action)

    # Verify post-conditions
    new_state = world_model.update(result)
    if not new_state.matches_expected():
        rollback(action)
        raise ActionFailed("Post-conditions not satisfied")

    return new_state

Tool Calling vs Code Execution

Tool calling: Structured function invocation

# LLM returns structured tool call
{"tool": "web_search", "args": {"query": "Python tutorials"}}

Code execution: Generate and run arbitrary code

# LLM generates code
code = """
import requests
data = requests.get('https://api.example.com').json()
result = [item['name'] for item in data if item['active']]
"""
exec(code)  # ⚠️ Security implications!

See: Agent Security Considerations

Evaluation

Measuring agent performance:

Success Metrics

from collections import defaultdict

class AgentMetrics:
    def __init__(self):
        self.tasks_completed = 0
        self.tasks_failed = 0
        self.steps_per_task = []  # steps taken by each completed task
        self.tool_usage = defaultdict(int)

    def task_success_rate(self) -> float:
        total = self.tasks_completed + self.tasks_failed
        return self.tasks_completed / total if total > 0 else 0.0

    def avg_steps_to_completion(self) -> float:
        if not self.steps_per_task:
            return 0.0
        return sum(self.steps_per_task) / len(self.steps_per_task)

    def tool_efficiency(self) -> dict:
        """Share of calls going to each tool"""
        total_calls = sum(self.tool_usage.values())
        if total_calls == 0:
            return {}
        return {
            tool: count / total_calls
            for tool, count in self.tool_usage.items()
        }

Benchmarks

  • WebArena: Web navigation tasks
  • SWE-bench: Software engineering tasks
  • GAIA: General AI assistant tasks
  • AgentBench: Multi-domain capabilities

Related: Agent Evaluation & Testing

Common Patterns

1. Retry with Refinement

async def retry_with_feedback(agent, task, max_attempts=3):
    """Retry failed tasks with error feedback"""
    for attempt in range(max_attempts):
        try:
            result = await agent.execute(task)
            return result
        except Exception as e:
            if attempt < max_attempts - 1:
                task = f"{task}\n\nPrevious attempt failed: {e}\nPlease try a different approach."
            else:
                raise

2. Human-in-the-Loop

class HumanApprovalAgent:
    async def execute_with_approval(self, action):
        """Require human approval for sensitive actions"""
        if action.is_sensitive():
            print(f"Agent wants to: {action.description}")
            approval = input("Approve? (yes/no): ")
            if approval.lower() != "yes":
                return "Action rejected by human"

        return await action.execute()

3. Self-Reflection

async def self_reflect(agent, task, result):
    """Agent critiques its own work"""
    critique = await agent.llm.generate(f"""
    Task: {task}
    My result: {result}

    Critically evaluate:
    1. Did I fully address the task?
    2. Are there errors or gaps?
    3. How could I improve?
    """)

    if critique.suggests_revision():
        return await agent.revise(result, critique)
    return result

Practical Applications

Customer Support

class SupportAgent:
    async def handle_ticket(self, ticket):
        # Classify issue
        category = await self.classify(ticket.description)

        # Search knowledge base
        kb_results = await self.search_kb(ticket.description)

        # If known solution exists
        if kb_results:
            return self.format_solution(kb_results[0])

        # Otherwise, escalate to human
        return await self.escalate(ticket, reason="No KB match")

Code Assistant

class CodeAgent:
    async def implement_feature(self, spec: str):
        # Generate code
        code = await self.generate_code(spec)

        # Run tests
        test_results = await self.run_tests(code)

        # Fix if tests fail (bounded to avoid an endless repair loop)
        attempts = 0
        while not test_results.passed and attempts < 5:
            errors = test_results.errors
            code = await self.fix_code(code, errors)
            test_results = await self.run_tests(code)
            attempts += 1

        return code

Related: Building Agents with LangChain, Production Agent Deployment

Research Assistant

class ResearchAgent:
    async def research_topic(self, topic: str):
        # Web search
        sources = await self.web_search(topic)

        # Extract information
        facts = []
        for source in sources:
            content = await self.fetch_url(source.url)
            extracted = await self.extract_facts(content, topic)
            facts.extend(extracted)

        # Synthesize report
        report = await self.synthesize(facts)

        # Add citations
        return self.add_citations(report, sources)

Challenges & Limitations

1. Reliability

Agents can fail unpredictably:

  • Tool calls with wrong arguments
  • Infinite loops
  • Context window overflow
  • Hallucinated actions
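
A first line of defense against bad tool calls is validating the model's arguments against the tool's declared parameter schema before executing anything. A sketch, using the simple `{"type": ..., "required": ...}` schema shape from the Tool examples earlier in this note:

```python
def validate_args(args: dict, schema: dict) -> list[str]:
    """Check required fields and basic types before a tool call runs.

    Returns a list of error messages; an empty list means the call is valid.
    """
    type_map = {"string": str, "number": (int, float), "boolean": bool}
    errors = []
    for name, spec in schema.items():
        if spec.get("required") and name not in args:
            errors.append(f"missing required argument '{name}'")
        elif name in args:
            expected = type_map.get(spec.get("type"))
            if expected and not isinstance(args[name], expected):
                errors.append(f"'{name}' should be {spec['type']}")
    for name in args:
        if name not in schema:
            errors.append(f"unexpected argument '{name}'")
    return errors
```

Returning the errors (rather than raising) lets the agent hand them back to the LLM as an observation, which usually prompts a corrected call on the next iteration.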

Mitigation: Agent Evaluation & Testing

2. Cost

LLM API costs accumulate:

# Track token usage
class CostTracker:
    def __init__(self, cost_per_1k_tokens: float):
        self.cost_per_1k = cost_per_1k_tokens
        self.total_tokens = 0

    def add_usage(self, input_tokens: int, output_tokens: int):
        self.total_tokens += input_tokens + output_tokens

    def total_cost(self) -> float:
        return (self.total_tokens / 1000) * self.cost_per_1k
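
In practice, providers price input and output tokens at different rates, so a slightly richer tracker keeps the two counts separate. The rates in the example below are placeholders, not any provider's real pricing:

```python
class SplitCostTracker:
    """Tracks input and output tokens at separate per-1k rates."""

    def __init__(self, input_per_1k: float, output_per_1k: float):
        self.input_per_1k = input_per_1k
        self.output_per_1k = output_per_1k
        self.input_tokens = 0
        self.output_tokens = 0

    def add_usage(self, input_tokens: int, output_tokens: int):
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    def total_cost(self) -> float:
        return (self.input_tokens / 1000) * self.input_per_1k + \
               (self.output_tokens / 1000) * self.output_per_1k

# Placeholder rates for illustration only
tracker = SplitCostTracker(input_per_1k=0.003, output_per_1k=0.015)
tracker.add_usage(input_tokens=2000, output_tokens=500)
```

Because output tokens typically cost several times more than input tokens, agents that generate long reasoning traces can be far more expensive than the raw token count suggests.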

Mitigation: Production Agent Deployment

3. Security

Agents can be exploited:

  • Prompt injection
  • Tool misuse
  • Data exfiltration

Mitigation: Agent Security Considerations

Learning Resources

Frameworks

  • LangChain: Popular Python agent framework
  • LangGraph: Stateful agent orchestration
  • AutoGPT: Autonomous agent template
  • CrewAI: Multi-agent collaboration

See: Agent Frameworks Comparison

Papers

  • "ReAct: Synergizing Reasoning and Acting in Language Models" (2023)
  • "Toolformer: Language Models Can Teach Themselves to Use Tools" (2023)
  • "Generative Agents: Interactive Simulacra of Human Behavior" (2023)
  • "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (2022)

Courses

  • DeepLearning.AI: "LangChain for LLM Application Development"
  • "Building AI Agents" by Harrison Chase
  • "Prompt Engineering for ChatGPT" by Vanderbilt

Connection Points

Start here:

  • AI Agents MOC β€” Main navigation
  • Tool Use & Function Calling β€” Extending agent capabilities
  • Agent Memory Systems β€” Context management

Deep dives:

Advanced:
